Description



[0001] This invention relates to methods of detecting and cloning individual mRNAs.

[0002] The activities of genes in cells are reflected in the kinds and quantities of their mRNA and protein species. 

Gene expression is crucial for processes such as aging, development, differentiation, metabolite production, progression
of the cell cycle, and infectious or genetic or other disease states. Identification of the expressed mRNAs will be
valuable for the elucidation of their molecular mechanisms, and for applications to the above processes. 
[0003] Mammalian cells contain approximately 15,000 different mRNA sequences; however, each mRNA sequence
is present at a different frequency within the cell. Generally, mRNAs are expressed at one of three levels. A few "abundant"
mRNAs are present at about 10,000 copies per cell, about 3,000-4,000 "intermediate" mRNAs are present at
300-500 copies per cell, and about 11,000 "low-abundance" or "rare" mRNAs are present at approximately 15 copies
per cell. The numerous genes that are represented by intermediate and low frequencies of their mRNAs can be cloned
by a variety of well established techniques (see for example Sambrook et al., 1989, Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor Press, pp. 8.6-8.35).
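By way of illustration only, the abundance figures quoted above can be turned into approximate molecule counts per cell. The following Python sketch uses only the numbers given in the paragraph; the dictionary layout and function name are assumptions made for this example, and the "abundant" class is left out because the text gives no species count for it.

```python
# Species counts and copies per cell as quoted in the paragraph above.
# Ranges are kept as (low, high) tuples. The "few" abundant species are
# omitted because the text does not say how many there are.
CLASSES = {
    "intermediate": {"species": (3_000, 4_000), "copies": (300, 500)},
    "rare": {"species": (11_000, 11_000), "copies": (15, 15)},
}


def molecules_per_cell(species, copies):
    """Return (low, high) total mRNA molecules for one abundance class."""
    return species[0] * copies[0], species[1] * copies[1]


totals = {
    name: molecules_per_cell(c["species"], c["copies"])
    for name, c in CLASSES.items()
}

for name, (low, high) in totals.items():
    print(f"{name}: {low:,} to {high:,} molecules per cell")
```

On these figures the rare class accounts for most of the distinct sequences (about 11,000 of 15,000) but only a small share of the mRNA molecules, which is why methods able to detect low-abundance messages are of interest.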

[0004] If something is known about the gene sequence or protein, several direct cloning methods are available.
However, if the identity of the desired gene is unknown, one must be able to select or enrich for the desired gene product
in order to identify the "unknown" gene without expending large amounts of time and resources.
[0005] The identification of unknown genes can often involve the use of subtractive or differential hybridization techniques.
Subtractive hybridization techniques rely upon the use of very closely related cell populations, such that differences
in gene expression will primarily represent the gene(s) of interest. A key element of the subtractive hybridization
technique is the construction of a comprehensive complementary-DNA ("cDNA") library.
[0006] The construction of a comprehensive cDNA library is now a fairly routine procedure. PolyA mRNA is prepared
from the desired cells and the first strand of the cDNA is synthesized using RNA-dependent DNA polymerase ("reverse
transcriptase") and an oligodeoxynucleotide primer of 12 to 18 thymidine residues. The second strand of the cDNA is
synthesized by one of several methods, the more efficient of which are commonly known as "replacement synthesis"
and "primed synthesis".

[0007] Replacement synthesis involves the use of ribonuclease H ("RNAase H"), which cleaves the phosphodiester
backbone of RNA that is in an RNA:DNA hybrid, leaving a 3' hydroxyl and a 5' phosphate, to produce nicks and gaps in
the mRNA strand, creating a series of RNA primers that are used by E. coli DNA polymerase I, or its "Klenow" fragment,
to synthesize the second strand of the cDNA. This reaction is very efficient; however, the cDNAs produced most often
lack the 5' terminus of the mRNA sequence.

[0008] Primed synthesis to generate the second cDNA strand is a general name for several methods which are more
difficult than replacement synthesis yet clone the 5' terminal sequences with high efficiency. In general, after the synthesis
of the first cDNA strand, the 3' end of the cDNA strand is extended with terminal transferase, an enzyme which
adds a homopolymeric "tail" of deoxynucleotides, most commonly deoxycytidylate. This tail is then hybridized to a
primer of oligodeoxyguanidylate or a synthetic fragment of DNA with a deoxyguanidylate tail, and the second strand
of the cDNA is synthesized using a DNA-dependent DNA polymerase.

[0009] The primed synthesis method is effective, but the method is laborious, and all resultant cDNA clones have a
tract of deoxyguanidylate immediately upstream of the mRNA sequence. This deoxyguanidylate tract can interfere with
transcription of the DNA in vitro or in vivo and can interfere with the sequencing of the clones by the Sanger dideoxynucleotide
sequencing method.

[0010] Once both cDNA strands have been synthesized, the cDNA library is constructed by cloning the cDNAs into 
an appropriate plasmid or viral vector. In practice this can be done by directly ligating the blunt ends of the cDNAs into 
a vector which has been digested by a restriction endonuclease to produce blunt ends. Blunt end ligations are very 
inefficient, however, and this is not a common method of choice. A generally used method involves adding synthetic
linkers or adapters containing restriction endonuclease recognition sequences to the ends of the cDNAs. The cDNAs 
can then be cloned into the desired vector at a greater efficiency. 

[0011] Once a comprehensive cDNA library is constructed from a cell line, desired genes can be identified with the 
assistance of subtractive hybridization (see for example Sargent T.D., 1987, Meth. Enzymol., Vol. 152, pp. 423-432;
Lee et al., 1991, Proc. Natl. Acad. Sci. USA, Vol. 88, pp. 2825-2830). A general method for subtractive hybridization
is as follows. The complementary strand of the cDNA is synthesized and radiolabeled. This single strand of cDNA can 
be made from polyA mRNA or from the existing cDNA library. The radiolabeled cDNA is hybridized to a large excess 
of mRNA from a closely related cell population. After hybridization the cDNA:mRNA hybrids are removed from the 
solution by chromatography on a hydroxylapatite column. The remaining "subtracted" radiolabeled cDNA can then be
used to screen a cDNA or genomic DNA library of the same cell population.

[0012] Subtractive hybridization removes the majority of the genes expressed in both cell populations and thus enriches
for genes which are present only in the desired cell population. However, if the expression of a particular mRNA
sequence is only a few times more abundant in the desired cell population than in the subtractive population, it may not
be possible to isolate the gene by subtractive hybridization.
[0013] Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 5673-5677, August 1989, Biochemistry, discloses one-sided polymerase
chain reaction (the amplification of cDNA), a rapid technique, based on the polymerase chain reaction (PCR), for
the direct targeting, enhancement, and sequencing of previously uncharacterized cDNAs. This method is not limited
to previously sequenced transcripts, since it requires only two adjacent or partially overlapping specific primers from
only one side of the region to be amplified. These primers can be located anywhere within the message. The specific
primers are used in conjunction with nonspecific primers targeted either to the poly(A)+ region of the message or to
an enzymatically synthesized d(A) tail.

[0014] Pairwise combinations of specific and general primers allow for the amplification of regions both 3' and 5' to
the point of entry into the message. The amplified PCR products can be cloned, sequenced directly by genomic sequencing,
or labeled for sequencing by amplifying with a radioactive primer. We illustrate the power of this approach
by deriving the cDNA sequences for the skeletal muscle α-tropomyosins of European common frog (Rana temporaria)
and zebrafish (Brachydanio rerio) using only 300 ng of a total poly(A)+ preparation. In these examples, we gained initial
entry into the tropomyosin messages by using heterologous primers (to conserved regions) derived from the rat skeletal
muscle α-tropomyosin sequence. The frog and zebrafish sequences are used in an analysis of tropomyosin evolution
across the vertebrate phylogenetic spectrum. The results underscore the conservative nature of the tropomyosin molecule
and support the notion of a constrained heptapeptide unit as the fundamental structural motif of tropomyosin.
[0015] Nucleic Acids Research, Vol. 19, No. 7, discloses efficient double stranded sequencing of cDNA clones containing
long poly(A) tails using anchored poly(dT) primers. Sequencing double stranded DNA templates has become
a common and efficient procedure (1) for rapidly obtaining sequence data while avoiding preparation of single stranded
DNA. Here we report the applicability of this procedure to sequencing cDNA clones containing long stretches of poly(A).
Double stranded templates of cDNAs containing long poly(A) tracts are difficult to sequence with vector primers
(e.g. universal M13) which anneal downstream of the poly(A) tail. Sequencing with these primers results in a long poly(T)
ladder followed by a sequence which is difficult to read (Fig. 1). In an attempt to solve this problem we synthesized
three primers which contain (dT)17 and either (dA) or (dC) or (dG) at the 3' end. We reasoned that the presence of
these three bases at the 3' end would 'anchor' the primers at the upstream end of the poly(A) tail and allow sequencing
of the region immediately upstream of the poly(A) region.

[0016] Anchored primers were synthesized on an Applied Biosystems (ABI) 391 DNA synthesizer and used after
purification on Oligonucleotide Purification Cartridges (ABI). For sequencing with anchored primers, 5-10 µg of plasmid
DNA was denatured in a total volume of 50 µl containing 0.2 M sodium hydroxide and 0.16 mM EDTA by incubation
at 65°C for 10 minutes. The three poly(dT) anchored primers (2 pmol of each) were added and the mixture immediately
placed on ice. The solution was then neutralized by the addition of 5 µl of 5 M ammonium acetate pH 7.0. The DNA
was precipitated by addition of 150 µl of cold 95% ethanol and the pellet washed twice with cold 70% ethanol. The
pellet was dried for 5 minutes and then resuspended in 1x sequencing buffer (1x = 40 mM Tris-HCl pH 7.5, 20 mM
MgCl2, 50 mM NaCl). Primers were annealed by heating the solution for 2 minutes at 65°C followed by slow cooling to
room temperature. Sequencing reactions, using modified T7 DNA polymerase (Sequenase, United States Biochemicals),
were then carried out using [32P]α-dATP (> 1000 Ci/mmole) according to the protocol supplied with the Sequenase
kit. Under these conditions over 300 bp of readable sequence could be obtained (Fig. 1). We have applied this
approach to several other poly(A)-containing cDNA clones with similar results. Sequencing of the opposite strand of
these cDNAs using insert-specific primers verified that the sequences obtained with the anchored primers occurred
directly upstream of the poly(A) region (data not shown).

[0017] The ability to directly obtain sequence immediately upstream from the poly(A) tail of cDNAs, as demonstrated
here, should be of particular importance to large scale efforts to generate sequence-tagged sites (STSs) (2) from cDNAs (3).
[0018] Nucleic Acids Research, Vol. 19, No. 13, p. 3747, discloses a novel 3' extension technique using random primers
in RNA-PCR.

[0019] In order to obtain sequence 3' to a partial ~2 kb titin sequence of the >21 kb titin mRNA (1) that was too
distant from the poly A tail for 3' RACE methodologies (2), RNA-PCR (3, 4) was done using a primer containing a
random hexamer at its 3' end. Four µg of rabbit cardiac muscle total RNA (5) in 25 µL were reverse transcribed per
BRL's recommendations using 100 ng RT primer (Figure 1) and 200 U MLV reverse transcriptase (BRL). After RNase
H digestion, 10 µl was used for PCR in 100 µL using primers complementary to either known titin sequence (Figure
1, TS1) or the RT primer (Figure 1, Y primer), and 30 cycles of 93°C-45 sec, 45°C-1.5 min, 72°C-3.0 min. Although
defined fragments from 100-1000 bp were observed after the first PCR (Figure 2a), only fragments <700 bp purified
by Geneclean (Bio 101) re-amplified during the second PCR using primers complementary to the RT primer (Figure
1, X primer containing a SalI site) or the known titin sequence (Figure 1, TS2 containing a NotI site). Lower or higher
concentrations of RT primer resulted in no or very small amplification products, respectively. Only the random hexamer
part of the RT primer initiated reverse transcription from 6-bp sequences of titin mRNA that had at least 50% G/C
content.

[0020] Final amplification products (Figure 2b) were sequenced by dideoxy chain termination methods using Sequenase
(US Biochemical) after digestion with NotI and SalI restriction enzymes and ligation into pBluescript (Stratagene).
Two clones were sequenced because of the possibility of infidelity of the Taq polymerase. After repeating this entire
procedure three times using new sets of titin-specific primers, the titin sequence was extended 1109 bp (EMBL accession
no. X59596). The second amplification step could perhaps be omitted or done asymmetrically for sequencing.
[0021] This technique should also be applicable to 5' extensions of cDNA clones. Perhaps a modification of it could
be applied to extensions of known genomic DNA in either direction.

[0022] The Journal of Cell Biology, Vol. 115, No. 4, November 1991, 887-903 discloses an analysis of vertebrate 
mRNA sequences: intimations of translational control. 

[0023] Five structural features in mRNAs have been found to contribute to the fidelity and efficiency of initiation by
eukaryotic ribosomes. Scrutiny of vertebrate cDNA sequences in light of these criteria reveals a set of transcripts
(encoding oncoproteins, growth factors, transcription factors, and other regulatory proteins) that seem designed to be
translated poorly. Thus, throttling at the level of translation may be a critical component of gene regulation in vertebrates.
An alternative interpretation is that some (perhaps many) cDNAs with encumbered 5' noncoding sequences represent
mRNA precursors, which would imply extensive regulation at a posttranscriptional step that precedes translation.

[0024] We have discovered a method for identifying, isolating and cloning mRNAs as cDNAs using a polymerase
amplification method that employs at least two oligodeoxynucleotide primers. In one approach, the first primer contains
sequence capable of hybridizing to a site including sequence that is immediately upstream of the first A ribonucleotide
of the mRNA's polyA tail and the second primer contains arbitrary sequence. In another approach, the first primer
contains sequence capable of hybridizing to a site including the mRNA's polyA signal sequence and the second primer
contains arbitrary sequence. In another approach, the first primer contains arbitrary sequence and the second primer
contains sequence capable of hybridizing to a site including the mRNA's Kozak sequence. In another approach, the
first primer contains a sequence that is substantially complementary to the sequence of an mRNA having a known
sequence and the second primer contains arbitrary sequence. In another approach, the first primer contains arbitrary
sequence and the second primer contains sequence that is substantially identical to the sequence of an mRNA having
a known sequence. The first primer is used as a primer for reverse transcription of the mRNA and the resultant cDNA
is amplified with a polymerase using both the first and second primers as a primer set.

[0025] Using this method with different pairs of the alterable primers, virtually any or all of the mRNAs from any cell
type or any stage of the cell cycle, including very low abundance mRNAs, can be identified and isolated. Additionally,
a comparison of the mRNAs from closely related cells, which may be for example at different stages of development
or different stages of the cell cycle, can show which of the mRNAs are constitutively expressed and which are differentially
expressed, and their respective frequencies of expression.

[0026] The "first primer" or "first oligodeoxynucleotide" as used herein is defined as being the oligodeoxynucleotide
primer that is used for the reverse transcription of the mRNA to make the first cDNA strand, and then is also used for
amplification of the cDNA. The first primer can also be referred to as the 3' primer, as this primer will hybridize to the
mRNA and will define the 3' end of the first cDNA strand. The "second primer" as used herein is defined as being the
oligodeoxynucleotide primer that is used to make the second cDNA strand, and is also used for the amplification of
the cDNA. The second primer may also be referred to as the 5' primer, as this primer will hybridize to the first cDNA
strand and will define the 5' end of the second cDNA strand.

[0027] The "arbitrary" sequence of an oligodeoxynucleotide primer as used herein is defined as being based upon
or subject to individual judgement or discretion. In some instances, the arbitrary sequence can be entirely random or
partly random for one or more bases. In other instances the arbitrary sequence can be selected to contain a specific
ratio of each deoxynucleotide, for example approximately equal proportions of each deoxynucleotide or predominantly
one deoxynucleotide, or to not contain a specific deoxynucleotide. The arbitrary sequence can be selected to contain,
or not to contain, a recognition site for a specific restriction endonuclease. The arbitrary sequence can be selected to
either contain a sequence that is substantially identical (at least 50% homologous) to an mRNA of known sequence or to
not contain sequence from an mRNA of known sequence.
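By way of illustration only, the selection criteria for an arbitrary sequence described above can be sketched in Python. The function names, the default length of 10, and the use of a pseudo-random generator are assumptions made for this example; the EcoRI recognition site GAATTC is used simply as a familiar restriction site.

```python
import random

BASES = "ACGT"


def arbitrary_primer(length=10, exclude="", rng=None):
    """Draw a primer of `length` bases, skipping any base listed in `exclude`."""
    rng = rng or random.Random()
    allowed = [b for b in BASES if b not in exclude]
    if not allowed:
        raise ValueError("no bases left to choose from")
    return "".join(rng.choice(allowed) for _ in range(length))


def contains_site(primer, site):
    """True if the primer carries the given restriction recognition site."""
    return site in primer


def base_fractions(primer):
    """Fraction of each deoxynucleotide in the primer."""
    return {b: primer.count(b) / len(primer) for b in BASES}


# Example: a 10-mer that contains no thymidylate, drawn reproducibly.
primer = arbitrary_primer(10, exclude="T", rng=random.Random(0))
print(primer, base_fractions(primer), contains_site(primer, "GAATTC"))
```

A sequence drawn this way is only a candidate; whether it meets a desired base ratio or carries a desired recognition site would be checked with helpers such as `base_fractions` and `contains_site` before use.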

[0028] An oligodeoxynucleotide primer can be either "complementary" to a sequence or "substantially identical" to a
sequence. As defined herein, a complementary oligodeoxynucleotide primer is a primer that contains a sequence which
will hybridize to an mRNA, that is, the bases are complementary to each other and a reverse transcriptase will be able
to extend the primer to form a cDNA strand of the mRNA. As defined herein, a substantially identical primer is a primer
that contains sequence which is the same as the sequence of an mRNA, that is, greater than 50% identical, and the
primer has the same orientation as an mRNA; thus it will not hybridize to, or complement, an mRNA, but such a primer
can be used to hybridize to the first cDNA strand and can be extended by a polymerase to generate the second cDNA
strand. The terms of art "hybridization" or "hybridize", as used herein, are defined to be the base pairing of an oligodeoxynucleotide
primer with an mRNA or cDNA strand. The "conditions under which" an oligodeoxynucleotide hybridizes
with an mRNA or a cDNA, as used herein, are defined to be temperature and buffer conditions (that are described later)
under which the base pairing of the oligodeoxynucleotide primer with either an mRNA or a cDNA occurs and only a
few mismatches (one or two) of the base pairing are permissible.
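By way of illustration only, the distinction between a "complementary" primer and a "substantially identical" primer can be sketched in Python. The function names are assumptions made for this example, sequences are written 5' to 3', and identity is computed position by position over sequences of equal length, which is a simplification of any real alignment.

```python
# Watson-Crick partners; U is accepted so that mRNA sites can be given as RNA.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def complementary_primer(mrna_site):
    """DNA primer (5'->3') that base-pairs with an mRNA site given 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(mrna_site.upper()))


def percent_identity(primer, mrna_site):
    """Percent of positions at which a DNA primer matches an mRNA site (U read as T)."""
    site = mrna_site.upper().replace("U", "T")
    if len(primer) != len(site):
        raise ValueError("sequences must have equal length")
    matches = sum(p == s for p, s in zip(primer.upper(), site))
    return 100.0 * matches / len(site)


def substantially_identical(primer, mrna_site):
    """Apply the 'greater than 50% identical' threshold from the definition."""
    return percent_identity(primer, mrna_site) > 50.0


site = "GCCACCAUGG"
print(complementary_primer(site))                 # primer usable for reverse transcription
print(substantially_identical("GCCACCATGG", site))  # same orientation as the mRNA
```

A primer returned by `complementary_primer` would serve as a first primer, since it anneals to the mRNA itself, whereas a primer passing `substantially_identical` has the orientation of the mRNA and would anneal only to the first cDNA strand.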
[0029] An oligonucleotide primer can contain a sequence that is known to be a "consensus sequence" of an mRNA
of known sequence. As defined herein, a "consensus sequence" is a sequence that has been found in a gene family 
of proteins having a similar function or similar properties. The use of a primer that includes a consensus sequence 
may result in the cloning of additional members of a desired gene family. 
[0030] The "preferred length" of an oligodeoxynucleotide primer, as used herein, is determined from the desired
specificity of annealing and the number of oligodeoxynucleotides having the desired specificity that are required to
hybridize to all the mRNAs in a cell. An oligodeoxynucleotide primer of 20 nucleotides is more specific than an oligodeoxynucleotide
primer of 10 nucleotides; however, addition of each random nucleotide to an oligodeoxynucleotide
primer increases fourfold the number of oligodeoxynucleotide primers required in order to hybridize to every mRNA in
a cell.



[0031] In one aspect, in general, the invention features a method for identifying and isolating mRNAs by priming a 
preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence capable 
of hybridizing to a site including sequence that is immediately upstream of the first A ribonucleotide of the mRNA's 
polyA tail, and amplifying the cDNA by a polymerase amplification method using the first primer and a second oligodeoxynucleotide
primer, for example a primer having arbitrary sequence, as a primer set.

[0032] In preferred embodiments, the first primer contains at least 1 nucleotide at the 3' end of the oligodeoxynucleotide
that can hybridize to an mRNA sequence that is immediately upstream of the polyA tail, and contains at least 11
nucleotides at the 5' end that will hybridize to the polyA tail. The entire 3' oligodeoxynucleotide is preferably at least
13 nucleotides in length, and can be up to 20 nucleotides in length.

[0033] Most preferably, the first primer contains 2 nucleotides at the 3' end of the oligodeoxynucleotide that can
hybridize to an mRNA sequence that is immediately upstream of the polyA tail. Preferably, the 2 polyA-non-complementary
nucleotides are of the sequence VN, where V is deoxyadenylate ("dA"), deoxyguanylate ("dG"), or deoxycytidylate
("dC"), and N, the 3' terminal nucleotide, is dA, dG, dC, or deoxythymidylate ("dT"). Thus the sequence of a
preferred first primer is 5'-TTTTTTTTTTTVN [Seq. ID. No. 1]. The use of 2 nucleotides can provide accurate positioning
of the first primer at the junction between the mRNA and its polyA tail, as the properly aligned oligodeoxynucleotide:mRNA
hybrids are more stable than improperly aligned hybrids, and thus the properly aligned hybrids will form and
remain hybridized at higher temperatures. In preferred applications, the mRNA sample will be divided into at least
twelve aliquots and one of the 12 possible VN sequences of the first primer will be used in each reaction to prime the
reverse transcription of the mRNA. The use of an oligodeoxynucleotide with a single sequence will reduce the number
of mRNAs to be analyzed in each sample by binding to a subset of the mRNAs, statistically 1/12th, thus simplifying
the identification of the mRNAs in each sample.
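By way of illustration only, the twelve VN-anchored first primers described above can be enumerated in Python. The function name, its parameters, and the aliquot numbering are assumptions made for this example.

```python
from itertools import product


def anchored_primers(n_t=11, anchor=("ACG", "ACGT")):
    """Enumerate 5'-T(n)VN-style primers, one per combination of anchor bases.

    `anchor` lists the bases allowed at each non-polyA-complementary position,
    read 5' to 3': V (A, C or G) followed by N (any base).
    """
    tail = "T" * n_t
    return [tail + "".join(combo) for combo in product(*anchor)]


vn_primers = anchored_primers()

# One primer per aliquot of the mRNA sample, numbered from 1.
aliquots = {i: primer for i, primer in enumerate(vn_primers, start=1)}

for number, primer in aliquots.items():
    print(f"aliquot {number:2d}: 5'-{primer}")
```

Because V excludes dT, no primer in the list can slide along the polyA tail by pairing its penultimate base with an A, which is the positioning property the paragraph relies on.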

[0034] In some embodiments, the 3' end of the first primer can have 1 nucleotide that can hybridize to an mRNA
sequence that is immediately upstream of the polyA tail, and 12 nucleotides at the 5' end that will hybridize to the polyA
tail; thus the primer will have the sequence 5'-TTTTTTTTTTTTV [Seq. ID. No. 2]. The use of a single non-polyA-complementary
deoxynucleotide would decrease the number of oligodeoxynucleotides that are required to identify
every mRNA to 3; however, the use of a single nucleotide to position the annealing of primer to the junction of the
mRNA sequence and the polyA tail may result in a significant loss of specificity of the annealing, and 2 non-polyA-complementary
nucleotides are preferred.

[0035] In some embodiments, the 3' end of the first primer can have 3 or more nucleotides that can hybridize to an
mRNA sequence that is immediately upstream of the polyA tail. The addition of each nucleotide to the 3' end will further
increase the stability of properly aligned hybrids, and the sequence to hybridize to the polyA tail can be decreased by
one nucleotide for each additional non-polyA-complementary nucleotide added. The use of such a first primer may not
be practical for rapid screening of the mRNAs contained within a given cell line, as the use of a first primer with more
than 2 nucleotides that hybridize to the mRNA immediately upstream of the polyA tail significantly increases the number
of oligodeoxynucleotides required to identify every mRNA. For instance, the primer 5'-TTTTTTTTTTVNN [Seq. ID. No.
3] would require the use of 48 separate first primers in order to bind to every mRNA, and would significantly increase
the number of reactions required to screen the mRNA from a given cell line. The use of oligodeoxynucleotides with a
single random nucleotide in one position as a group of four can circumvent the problem of needing to set up 48 separate
reactions in order to identify every mRNA. However, as the non-polyA-complementary sequence becomes longer, it
would quickly become necessary to increase the number of reactions required to identify every mRNA.
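By way of illustration only, the growth in the number of first primers with the length of the non-polyA-complementary sequence, and the saving obtained by using primers as groups of four, can be worked through in Python. The function names are assumptions made for this example; the count 3 x 4^(k-1) follows from one V position (3 choices) followed by k-1 N positions (4 choices each).

```python
from itertools import product


def count_anchored(k):
    """Number of distinct primers with k anchor bases: one V, then k-1 N's."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 3 * 4 ** (k - 1)


def pooled_reactions(k, degenerate_positions=1):
    """Reactions needed when some N positions are used as four-base mixtures."""
    if not 0 <= degenerate_positions < k:
        raise ValueError("only the N positions (k - 1 of them) can be degenerate")
    return count_anchored(k) // 4 ** degenerate_positions


def vnn_pools(n_t=10):
    """Group the 48 T(n)VNN primers into pools that differ only in the last base."""
    pools = {}
    for v, n1, n2 in product("ACG", "ACGT", "ACGT"):
        pools.setdefault(v + n1, []).append("T" * n_t + v + n1 + n2)
    return pools


for k in (1, 2, 3, 4):
    print(k, count_anchored(k), pooled_reactions(k, min(1, k - 1)))
```

With one degenerate position, the 48 VNN primers fall into 12 groups of four, so the number of reactions returns to that of the VN design; a fourth anchor base would again require 48 reactions even with such grouping.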

[0036] In preferred embodiments, the second primer is of arbitrary sequence and is at least 9 nucleotides in length. 
Preferably the second primer is at most 13 nucleotides in length and can be up to 20 nucleotides in length. 
[0037] In another aspect, in general, the invention features a method for preparing and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first primer that contains a sequence capable of hybridizing to
the polyadenylation signal sequence and at least 4 nucleotides that are positioned 5', or 3', or both of the polyadenylation
signal sequence; this entire first primer is preferably at least 10 nucleotides in length, and can be up to 20 nucleotides
in length. In one preferred embodiment the sequence 5'-NNTTTATTNN [Seq. ID. No. 4] can be chosen such that the
sequence is 5'-GCTTTATTNC [Seq. ID. No. 5], and the four resultant primers are used together in a single reaction for
the priming of the mRNA for reverse transcription. Once the first cDNA strand has been formed by reverse transcription
then the first primer can be used with a second primer, for example an arbitrary sequence primer, for the amplification
of the cDNA.
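By way of illustration only, a degenerate primer of the kind described above can be expanded into its individual sequences in Python. The function name and the symbol table are assumptions made for this example, and the example assumes that Seq. ID. No. 5 reads GCTTTATTNC, that is, an instance of the NNTTTATTNN pattern with a single N remaining. The core TTTATT is the reverse complement of the canonical polyadenylation signal AAUAAA.

```python
from itertools import product

# Degenerate symbols used in this document: N is any base, V is A, C or G.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}


def expand(degenerate):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(SYMBOLS[s] for s in degenerate))]


signal_primers = expand("GCTTTATTNC")   # assumed reading of Seq. ID. No. 5
print(signal_primers)
print(len(expand("NNTTTATTNN")))        # size of the fully degenerate family
```

Fixing three of the four N positions reduces the family from 256 sequences to the four that are used together in a single reaction.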

[0038] In one aspect, in general, the invention features a method for identifying and isolating mRNAs by priming a
preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer to generate a first cDNA strand,
and priming the preparation of the second cDNA strand with a second primer that contains sequence substantially
identical to the Kozak sequence of mRNA, and amplifying the cDNA by a polymerase amplification method using the
first and second primers as a primer set.

[0039] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers
are 10 deoxynucleotides in length. 

[0040] In preferred embodiments the sequence of the first primer is selected at random, or the first primer contains 
a selected arbitrary sequence, or the first primer contains a restriction endonuclease recognition sequence. 
[0041] In preferred embodiments the sequence of the second primer that contains sequence substantially identical
to the Kozak sequence of mRNA has the sequence NNNANNATGN [Seq. ID No. 6], or has the sequence NNNANNATGG
[Seq. ID No. 7], where N is any of the four deoxynucleotides. Preferably, the second primer has the sequence
GCCACCATGG [Seq. ID No. 8]. In some embodiments the first primer may further include a restriction endonuclease
recognition sequence that is added to either the 5' or 3' end of the primer increasing the length of the primer by at least
5 nucleotides.
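By way of illustration only, the relationship between the three Kozak primer sequences given above can be checked in Python by treating each N as "any of the four deoxynucleotides". The function and variable names are assumptions made for this example.

```python
import re


def pattern_to_regex(pattern):
    """Compile a primer pattern in which N stands for any of A, C, G or T."""
    return re.compile("".join("[ACGT]" if b == "N" else b for b in pattern))


def fits(primer, regex):
    """True if the whole primer is an instance of the compiled pattern."""
    return regex.fullmatch(primer) is not None


SEQ6 = pattern_to_regex("NNNANNATGN")
SEQ7 = pattern_to_regex("NNNANNATGG")
SEQ8 = "GCCACCATGG"

print(fits(SEQ8, SEQ6), fits(SEQ8, SEQ7))
print(4 ** "NNNANNATGN".count("N"), 4 ** "NNNANNATGG".count("N"))
```

The preferred primer is thus one member of the narrower family, which is itself a quarter of the broader family: fixing the base after ATG as G removes one degenerate position.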

[0042] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence that
is substantially complementary to the sequence of an mRNA having a known sequence, and priming the preparation of
the second cDNA strand with a second primer, and amplifying the cDNA by a polymerase amplification method using
the first and second primers as a primer set.

[0043] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers 
are 10 deoxynucleotides in length. 

[0044] In preferred embodiments the sequence of the first primer further includes a restriction endonuclease sequence,
which may be included within the preferred 10 nucleotides of the primer or may be added to either the 3' or
5' end of the primer increasing the length of the oligodeoxynucleotide primer by at least 5 nucleotides.

[0045] In preferred embodiments the sequence of the second primer is selected at random, or the second primer 
contains a selected arbitrary sequence, or the second primer contains a restriction endonuclease recognition sequence. 
[0046] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming 
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer, and priming the preparation 

of the second cDNA strand with a second primer that contains sequence that is substantially identical to the sequence
of an mRNA having a known sequence, and amplifying the cDNA by a polymerase amplification method using the first
and second primers as a primer set. 

[0047] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers 

are 10 deoxynucleotides in length.

[0048] In preferred embodiments the sequence of the first primer is selected at random, or the first primer contains 
a selected arbitrary sequence, or the first primer contains a restriction endonuclease recognition sequence. 
[0049] In preferred embodiments the sequence of the second primer having a sequence that is substantially complementary
to the sequence of an mRNA having a known sequence further includes a restriction endonuclease sequence,
which may be included within the preferred 10 nucleotides of the primer or may be added to either the 3' or
5' end of the primer increasing the length of the oligodeoxynucleotide primer by at least 5 nucleotides.
[0050] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming 
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence that 
is substantially complementary to the sequence of a mRNA having a known sequence, and priming the preparation of 

the second cDNA strand with a second primer that contains sequence that is substantially identical to the Kozak sequence
of mRNA, and amplifying the cDNA by a polymerase amplification method using the first and second primers
as a primer set. 

[0051] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers 
are 10 deoxynucleotides in length.

[0052] In some preferred embodiments of each of the general aspects of the invention, the amplified cDNAs are 
separated and then the desired cDNAs are reamplified using a polymerase amplification reaction and the first and 
second oligodeoxynucleotide primers. 
[0053] In preferred embodiments of each of the general aspects of the invention, a set of first and second oligodeoxynucleotide
primers can be used, consisting of more than one of each primer. In some embodiments more than one
of the first primer will be included in the reverse transcription reaction and more than one each of the first and second
primers will be included in the amplification reactions. The use of more than one of each primer will increase the number
of mRNAs identified in each reaction, and the total number of primers to be used will be determined based upon the
desired method of separating the cDNAs such that it remains possible to fully isolate each individual cDNA. In preferred 
embodiments a few hundred cDNAs can be isolated and identified using denaturing polyacrylamide gel electrophoresis. 
[0054] The method according to the invention is a significant advance over current cloning techniques that utilize
subtractive hybridization. In one aspect, the method according to the invention enables genes which are altered
in their frequency of expression, as well as mRNAs which are constitutively and differentially expressed, to be
identified by simple visual inspection and isolated. In another aspect the method according to the invention provides
specific oligodeoxynucleotide primers for amplification of the desired mRNA as cDNA and makes unnecessary an
intermediary step of adding a homopolymeric tail to the first cDNA strand for priming of the second cDNA strand, thereby
avoiding any interference from the homopolymeric tail with subsequent analysis of the isolated gene and its product.
In another aspect the method according to the invention allows the cloning and sequencing of selected mRNAs, so that
the investigator may determine the relative desirability of the gene prior to screening a comprehensive cDNA library
for the full length gene product.

Description of the Preferred Embodiments 




[0055] Fig. 1 is a schematic representation of the method according to the invention. 

[0056] Fig. 2 is the sequence of the 3' end of the N1 gene from normal mouse fibroblast cells (A31) [Seq. ID. No. 9].
[0057] Fig. 3 is the Northern blot of the N1 sequence on total cellular RNA from normal and tumorigenic mouse
fibroblast cells.

[0058] Fig. 4 is a sequencing gel showing the results of amplification for mRNA prepared from four sources (lanes
1-4), using the Kozak primer alone, the AP-1 primer alone, the Kozak and AP-1 primers, the Kozak and AP-2 primers,
the Kozak and AP-3 primers, the Kozak and AP-4 primers and the Kozak and AP-5 primers. This gel will be more fully
described later.

[0059] Fig. 5 is a partial sequence of the 5' end of a clone, K1, that was cloned from the A1-5 cell line that was
cultured at the non-permissive temperature and then shifted to the permissive temperature (32.5°C) for 24 h prior to
the preparation of the mRNA. The A1-5 cell line is from a primary rat embryo fibroblast cell line that has been doubly
transformed with ras and a temperature sensitive mutation of P53 ("P53ts").



General Description, Development of the Method 

[0060] By way of illustration a description of examples of the method of the invention follows, with a description by 
way of guidance of how the particular illustrative examples were developed. 

[0061] It is important for operation of the method that the length of the oligodeoxynucleotide be appropriate for specific
hybridization to mRNA. In order to obtain specific hybridization, whether for conventional cloning methods or PCR,
oligodeoxynucleotides are usually chosen to be 20 or more nucleotides in length. The use of long oligodeoxynucleotides
in this instance would decrease the number of mRNAs identified during each trial and would greatly increase the
number of oligodeoxynucleotides required to identify every mRNA. Recently, it was demonstrated that 9-10 nucleotide
primers can be used for DNA polymorphism analysis by PCR (Williams et al., 1991, Nuc. Acids Res., Vol. 18, pp.
6531-6535).

[0062] The plasmid containing the cloned murine thymidine kinase gene ("TK cDNA plasmid") was used as a model
template to determine the required lengths of oligodeoxynucleotides for specific hybridization to an mRNA, and for the
production of specific PCR products. The oligodeoxynucleotide primer chosen to hybridize internally in the mRNA was
varied between 6 and 13 nucleotides in length, and the oligodeoxynucleotide primer chosen to hybridize at the upstream
end of the polyA tail was varied between 7 and 14 nucleotides in length. After numerous trials with different sets and
lengths of primers, it was determined that an annealing temperature of 42°C is optimal for product specificity, that the
internally hybridizing oligodeoxynucleotide should be at least 9 nucleotides in length, and that an oligodeoxynucleotide
of at least 13 nucleotides in length is required to bind to the upstream end of the polyA tail.

[0063] With reference now to Fig. 1, the method according to the invention is depicted schematically. The mRNAs
are mixed with the first primer, for example TTTTTTTTTTTVN [Seq. ID. No. 2] (T11VN) 1, and reverse transcribed 2 to
make the first cDNA strand. The cDNA is amplified as follows. The first cDNA strand is added to the second primer,
the first primer and the polymerase in the standard buffer with the appropriate concentrations of nucleotides, and
the components are heated to 94°C to denature the mRNA:cDNA hybrid 3, the temperature is reduced to 42°C to allow
the second primer to anneal 4, and then the temperature is increased to 72°C to allow the polymerase to extend the
second primer 5. The cycling of the temperature is then repeated 6, 7, 8, to begin the amplification of the sequences
which are hybridized by the first and second primers. The temperature is cycled until the desired number of copies of
each sequence has been made.
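
The temperature cycle described in this paragraph can be summarized in a short Python sketch. It is illustrative only: the names `CYCLE_STEPS`, `schedule` and `ideal_copies` are hypothetical, hold times are not modelled, and the doubling per cycle is an idealized upper bound rather than a measured yield. The temperatures are those stated in the text.

```python
# Three-step temperature cycle from the text: denature, anneal, extend.
CYCLE_STEPS = [
    ("denature", 94),  # separate the hybrid or duplex strands
    ("anneal", 42),    # allow the short primers to hybridize
    ("extend", 72),    # polymerase extends the annealed primer
]


def schedule(cycles):
    """Yield (cycle_number, step_name, temperature_C) for every step."""
    for n in range(1, cycles + 1):
        for name, temp_c in CYCLE_STEPS:
            yield n, name, temp_c


def ideal_copies(cycles, start_copies=1):
    """Upper bound on copy number, assuming perfect doubling each cycle."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return start_copies * 2 ** cycles
```

Calling `list(schedule(2))` lists six steps, and `ideal_copies(10)` gives the idealized ceiling after ten cycles; real reactions fall below this bound.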

[0064] As is well known in the art, this amplification method can be accomplished using a thermostable polymerase
or a polymerase that is not thermostable. When a polymerase that is not thermostable is used, fresh polymerase
must be added after the annealing of the primers to the templates at the start of the elongation or extending step, and
the extension step must be carried out at a temperature that is permissible for the chosen polymerase.

[0065] The following examples of the method of the invention are presented for illustrative purposes only. As will be
appreciated, the method according to the invention can be used for the isolation of polyA mRNA from any source and
can be used to isolate genes expressed either differentially or constitutively at any level, from rare to abundant.



Example 1

[0066] Experimentation with the conditions required for accurate and reproducible results by PCR was conducted
with the TK cDNA plasmid and a single set of oligodeoxynucleotide primers; the sequence TTTTTTTTTTTCA ("T11CA")
[Seq. ID. No. 10] was chosen to hybridize to the upstream end of the polyA tail and the sequence CTTGATTGCC
("Ltk3") [Seq. ID. No. 11] was chosen to hybridize 288 base pairs ("bp") upstream of the polyA tail. The expected
fragment size using these two primers is 299 bp.

[0067] PCR was conducted under standard buffer conditions well known in the art with 10 ng TK cDNA plasmid
(buffer and polymerase are available from Perkin Elmer-Cetus). The standard conditions were altered in that the primers
were used at concentrations of 2.5 µM T11CA [Seq. ID. No. 10] and 0.5 µM Ltk3 [Seq. ID. No. 11], instead of 1 µM of each
primer. The concentration of the nucleotides ("dNTPs") was also varied over a 100 fold range, from the standard 200 µM
to 2 µM. The PCR parameters were 40 cycles of a denaturing step for 30 seconds at 94°C, an annealing step for
1 minute at 42°C, and an extension step for 30 seconds at 72°C. Significant amounts of non-specific PCR products
were observed when the dNTP concentration was 200 µM, whereas concentrations of dNTPs at or below 20 µM yielded
specifically amplified PCR products. The specificity of the PCR products was verified by restriction endonuclease digest
of the amplified DNA, which yielded the expected sizes of restriction fragments. In some instances it was found that
the use of up to 5 fold more of the first primer than the second primer also functioned to increase the specificity of the
product. Lowering the dNTP concentration to 2 µM allowed the labelling of the PCR products to a high specific activity
with [α-35S]dATP, 0.5 [α-35S]dATP (Sp. Act. 1200 Ci/mmol), which is necessary for distinguishing the PCR products
when resolved by high resolution denaturing polyacrylamide gel electrophoresis, in this case a DNA sequencing gel.
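
As a worked illustration of the numbers in this paragraph, the following Python sketch computes the ten-fold dNTP dilution series spanning the stated 100 fold range and the programmed hold time of the 40-cycle run. The function and constant names are hypothetical, and time spent ramping between temperatures is ignored, so the true run time would be longer.

```python
# Cycling conditions stated in the text for the TK cDNA plasmid reactions.
CYCLES = 40
STEP_SECONDS = {"denature_94C": 30, "anneal_42C": 60, "extend_72C": 30}

# Standard dNTP concentration stated in the text, in micromolar.
DNTP_STANDARD_UM = 200.0


def dntp_series(start_um, fold, steps):
    """Serial dilution: start_um, start_um/fold, ... (steps values)."""
    return [start_um / fold ** i for i in range(steps)]


def total_hold_minutes(cycles, step_seconds):
    """Programmed hold time only; ramping between steps is not counted."""
    return cycles * sum(step_seconds.values()) / 60


series = dntp_series(DNTP_STANDARD_UM, 10, 3)        # 200, 20 and 2 micromolar
hold_minutes = total_hold_minutes(CYCLES, STEP_SECONDS)
```

Under these assumptions each cycle holds for 120 seconds, so 40 cycles amount to 80 minutes of programmed hold time.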

Example 2

[0068] The PCR method of amplification with short oligodeoxynucleotide primers was then used to detect a subset
of mRNAs in mammalian cells. Total RNAs and mRNAs were prepared from mouse fibroblast cells which were either
growing normally, "cycling", or serum starved, "quiescent". The RNAs and mRNAs were reverse transcribed with T11CA
[Seq. ID. No. 10] as the primer. The T11CA primer [Seq. ID. No. 10] was annealed to the mRNA by heating the mRNA
and primer together to 65°C and allowing the mixture to gradually cool to 35°C. The reverse transcription reaction was
carried out with Moloney murine leukemia virus reverse transcriptase at 35°C. The resultant cDNAs were amplified by
PCR in the presence of T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11], as described in Example 1, using 2 µM
dNTPs. The use of the T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] primers allowed the TK mRNA to be used
as an internal control for differential expression of a rare mRNA transcript; TK mRNA is present at approximately 30
copies per cell. The DNA sequencing gel revealed 50 to 100 amplified mRNAs in the size range which is optimal for
further analysis, between 100 and 500 nucleotides. The patterns of the mRNA species observed in cycling and quiescent
cells were very similar, as expected, though some differences were apparent. Notably, the TK gene mRNA, which is
expressed during G1 and S phase, was found only in the RNA preparations from cycling cells, as expected, thus
demonstrating the ability of this method to separate and isolate rare mRNA species such as TK.



Example 3

[0069] The expression of mRNAs in normal and tumorigenic mouse fibroblast cells was also compared using the
T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] primers for the PCR amplification. The mRNA was reverse transcribed
using T11CA [Seq. ID. No. 10] as the primer and the resultant cDNA was amplified by PCR using 2 µM dNTPs and the
PCR parameters described above. The PCR products were separated on a DNA sequencing gel. The TK mRNA was
present at the same level in both the normal and tumorigenic mRNA preparations, as expected, and provided a good
internal control to demonstrate the representation of rare mRNA species. Several other bands were present in one
preparation and not in the other, with a few bands present in only the mRNA from normal cells and a few bands present
only in the mRNA from the tumorigenic cells; and some bands were expressed to different levels in the normal and
tumorigenic cells. Thus, the method according to the invention can be used to identify genes which are normally
continuously expressed (constitutive), and differentially expressed, suppressed, or otherwise altered in their level of
expression.

Cloning of the mRNA identified in Example 3 

[0070] Three cDNAs, namely the TK cDNA, one cDNA expressed only in normal cells ("N1"), and one cDNA
expressed only in tumorigenic cells ("T1"), were recovered from the DNA sequencing gel by electroelution, ethanol
precipitated to remove the urea and other contaminants, and reamplified by PCR, in two consecutive PCR amplifications
of 40 cycles each, with the primers T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] in the presence of 20 µM dNTPs
to achieve optimal yield without compromising the specificity. The reamplified PCR products were confirmed to have
the appropriate sizes and primer dependencies. As an additional control, the reamplified TK cDNA was digested with
two separate restriction endonucleases and the digestion products were also confirmed to be of the correct size.
[0071] The reamplified N1 [Seq. ID. No. 9] was cloned with the TA cloning system, Invitrogen Inc., into the plasmid
pCR1000 and sequenced. With reference now to Fig. 2, the nucleotide sequence clearly shows the N1 fragment [Seq.
ID. No. 9] to be flanked by the underlined Ltk3 primer 15 at the 5' end and the underlined T11CA primer 16 at the 3'
end as expected.

[0072] A Northern analysis of total cellular RNA using a radiolabelled N1 probe reconfirmed that the N1 mRNA was
only present in the normal mouse fibroblast cells, and not in the tumorigenic mouse fibroblast cells. With reference
now to Fig. 3, the probe used to detect the mRNA is labelled to the right of the figure, and the size of the N1 mRNA
can be estimated from the 28S and 18S markers depicted to the left of the figure. The N1 mRNA is present at low
abundance in both exponentially growing and quiescent normal cells, lanes 1 and 3, and is absent from both
exponentially growing and quiescent tumorigenic cells, lanes 2 and 4. As a control, the same Northern blot was reprobed with
a radiolabelled probe for 36B4, a gene that is expressed in both normal and tumorigenic cells, to demonstrate that
equal amounts of mRNA, lanes 1-4, were present on the Northern blot.

Example 4

[0073] The expression of mRNAs was compared in three cell lines, one of which was tested after culturing under
two different conditions. The cell lines were a primary rat embryo fibroblast cell line ("REF"), the REF
cell line that has been doubly transformed with ras and a mutant of P53 ("T101-4"), and the REF cell line that has been
doubly transformed with ras and a temperature sensitive mutation of P53 ("A1-5"). The A1-5 cell line was cultured at
the non-permissive temperature of 37°C, and also cultured at 37°C then shifted to the permissive temperature of 32.5°C
for 24 h prior to the preparation of the mRNA. The method of the invention was conducted using the primers "Kozak"
and one of five arbitrary sequence primers, "AP-1, AP-2, AP-3, AP-4, or AP-5", as the second and first primers,
respectively.

[0074] The sequence of the "Kozak" primer was chosen based upon the published consensus sequence for the
translation start site of mRNAs (Kozak, 1991, Jour. Cell Biology, Vol. 115, pp. 887-903). Degenerate Kozak primers
having sequences substantially identical to the translation start site consensus sequence were
used simultaneously; these sequences were 5'-GCCRCCATGG [Seq. ID No. 12], in which the R is dA or dG. Each
oligodeoxynucleotide primer thus has only one of the given nucleotides at that position, which results in a mixture of primers.
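
The way a single degenerate position produces a primer mixture can be made concrete with a small Python sketch. It expands IUPAC ambiguity codes into the individual sequences they stand for; the table covers only the codes needed here, and the helper name `expand_degenerate` is hypothetical rather than part of the method.

```python
from itertools import product

# IUPAC nucleotide codes used in this document (subset of the full table).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "V": "ACG",   # not T
    "N": "ACGT",  # any base
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


kozak_mixture = expand_degenerate("GCCRCCATGG")
```

For the Kozak primer the mixture has two members, GCCACCATGG and GCCGCCATGG. The same helper applied to an anchored primer such as TTTTTTTTTTTVN yields twelve sequences (three choices for V times four for N).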

[0075] The sequences of the five arbitrary primers were as follows: AP-1 had the sequence 5'-AGCCAGCGAA [Seq.
ID. No. 13]; AP-2 had the sequence 5'-GACCGCTTGT [Seq. ID. No. 14]; AP-3 had the sequence 5'-AGGTGACCGT
[Seq. ID. No. 15]; AP-4 had the sequence 5'-GGTACTCCAC [Seq. ID. No. 16]; and AP-5 had the sequence
5'-GTTGCGATCC [Seq. ID. No. 17]. These arbitrary sequence primers were chosen arbitrarily. In general each arbitrary
sequence primer was chosen to have a GC content of 50-70%.
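
The GC content criterion can be checked directly against the five listed sequences. The Python sketch below is a simple illustration; the names `ARBITRARY_PRIMERS` and `gc_percent` are hypothetical, and the sequences are copied from the paragraph above.

```python
# The five arbitrary primers as listed in the text (5' to 3').
ARBITRARY_PRIMERS = {
    "AP-1": "AGCCAGCGAA",
    "AP-2": "GACCGCTTGT",
    "AP-3": "AGGTGACCGT",
    "AP-4": "GGTACTCCAC",
    "AP-5": "GTTGCGATCC",
}


def gc_percent(seq):
    """Percentage of bases in seq that are G or C."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


gc_by_primer = {name: gc_percent(seq) for name, seq in ARBITRARY_PRIMERS.items()}
```

Each of the five primers contains six G or C bases out of ten, giving 60% GC, which lies inside the stated 50-70% window.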

[0076] The mRNA was reverse transcribed using one of the AP primers as the first primer, and the resultant first
cDNA strand was amplified in the presence of both primers, the AP primer and the degenerate Kozak primer, by PCR
using 2 µM dNTPs and the PCR parameters described above. The PCR products were separated on a DNA sequencing
gel. At least 50-100 amplified cDNA bands were present in each of the cell lines tested, and some bands were expressed
to different levels in the different cell lines. As a control a reaction was conducted using each arbitrary primer in the
absence of the Kozak primer. No cDNA was generated by the arbitrary primer alone, thus demonstrating that both
primers were required to amplify an mRNA into a cDNA.

[0077] With reference now to Fig. 4, the primer sets used for each reaction are shown at the top of the figure along
the line marked Primers. As a control a reaction was conducted using the primers in the absence of mRNA, and using
AP-1 with mRNA in the absence of the Kozak primer. No cDNA was generated by the primers in the absence of mRNA
or by the arbitrary primer alone, thus demonstrating that mRNA is required for amplification and that both primers were
required to amplify an mRNA into a cDNA. The cDNA products of the amplification were loaded in the same order
across the gel; thus the REF cell line is shown in each of lanes 1, cell line T101-4 is shown in each of lanes 2, cell line
A1-5 cultured at 37°C is shown in each of lanes 3, and cell line A1-5 cultured at 32.5°C is shown in each of lanes 4.
Each pair of primers resulted in the amplification of a different set of mRNAs from the cell lines. The reactions which
were conducted using the Kozak primer and any of primers AP-1, AP-2, AP-4, or AP-5 as a primer set resulted in the
amplification of the same cDNA pattern from each of cell lines REF, T101-4, A1-5 cultured at 37°C and A1-5 cultured
at 32.5°C. The amplification of mRNA from each cell line and temperature using the Kozak degenerate primer and the
AP-3 primer resulted in the finding of one band in particular which was present in the mRNA prepared from the A1-5
cell line when cultured at 32.5°C for 24 h, and not in any of the other mRNA preparations, as can be seen in Fig. 4,
designated as K1. Thus the method according to the invention may be used to identify genes which are differentially
expressed in mutant cell lines.

Cloning of the mRNA identified in Example 4

[0078] The cDNA ("K1") that was expressed only in the A1-5 cell line when cultured at 32.5°C was recovered from
the DNA sequencing gel and reamplified using the primers Kozak and AP-3 as described above. The reamplified K1
cDNA was confirmed to have the appropriate size of approximately 450 bp, and was cloned with the TA cloning system,
Invitrogen Inc., into the vector pCRII (Invitrogen, Inc.) according to the manufacturer's instructions, and sequenced.
With reference now to Fig. 5, the nucleotide sequence clearly shows the K1 clone to be flanked by the underlined Kozak
primer 20 at the 5' end and the underlined AP-3 primer 21 at the 3' end as expected. The 5' end of this partial cDNA
is identified in Seq. ID No. 18, and the 3' end of this cDNA is identified in Seq. ID No. 19. This partial sequence is an
open reading frame, and a search of the gene databases EMBO and Genbank has revealed the translated amino acid
sequence from the 3' portion of K1 to be homologous to the ubiquitin conjugating enzyme family (UBC enzyme). The
translated amino acid sequence of the 3' portion of K1 is 100% identical to a UBC enzyme from D. melanogaster,
75% identical to the UBC-4 enzyme and 79% identical to the UBC-5 enzyme from the yeast S. cerevisiae, and
75% identical to the UBC enzyme from Arabidopsis thaliana. The K1 clone may contain the actual 5' end of this gene;
otherwise the Kozak primer hybridized just after the 5' end. This result demonstrates that the method according to the
invention can be used to clone the 5' coding sequence of a gene.

Use 

[0079] The method according to the invention can be used to identify, isolate and clone mRNAs from any number of
sources. The method provides for the identification of desirable mRNAs by simple visual inspection after separation,
and can be used for investigative research, industrial and medical applications.

[0080] For instance, the reamplified cDNAs can be sequenced, or used to screen a DNA library in order to obtain
the full length gene. Once the sequence of the cDNA is known, amino acid peptides can be made from the translated
protein sequence and used to raise antibodies. These antibodies can be used for further research of the gene product
and its function, or can be applied to medical diagnosis and prognosis. The reamplified cDNAs can be cloned into an
appropriate vector for further propagation, or cloned into an appropriate expression vector in order to be expressed,
either in vitro or in vivo. The cDNAs which have been cloned into expression vectors can be used in industrial situations
for overproduction of the protein product. In other applications the reamplified cDNAs or their respective clones will be
used as probes for in situ hybridization. Such probes can also be used for the diagnosis or prognosis of disease.


Other Embodiments 

[0081] Other embodiments are within the following claims. 

[0082] The length of the oligodeoxynucleotide can be varied depending upon the annealing temperature chosen. In
the preferred embodiments the temperature was chosen to be 42°C and the oligonucleotide primers were chosen to
be at least 9 nucleotides in length. If the annealing temperature were decreased to 35°C, then the oligonucleotide
lengths could be decreased to as few as 6 nucleotides.

[0083] The cDNA could be radiolabeled with radioactive nucleotides other than 35S, such as 32P and 33P. When
desired, non-radioactive imaging methods can also be applied to the method according to the invention.
[0084] The amplification of the cDNA could be accomplished by a temperature cycling polymerase chain reaction,
as was described, using a heat stable DNA polymerase for the repetitive copying of the cDNA while cycling the
temperature for continuous rounds of denaturation, annealing and extension. Alternatively, the amplification could be
accomplished by an isothermal DNA amplification method (Walker et al., 1992, Proc. Natl. Acad. Sci., Vol. 89, pp. 392-396). The
isothermal amplification method would be adapted for amplifying cDNA by including an appropriate restriction
endonuclease sequence, one that will be nicked at hemiphosphorothioate recognition sites and whose recognition site
can be regenerated during synthesis with 35S labelled dNTPs.

[0085] Proteins having similar function or similar functional domains are often referred to as being part of a gene
family. Many such proteins have been cloned and identified to contain consensus sequences which are highly
conserved amongst the members of the family. This conservation of sequence can be used to design oligodeoxynucleotide
primers for the cloning of new members, or related members, of a family. Using the method of the invention the mRNA
from a cell can be reverse transcribed, and a cDNA could be amplified using at least one primer that has a sequence
substantially identical to the sequence of an mRNA of known sequence. Consensus sequences for at least the following
families and functional domains have been described in the literature: protein tyrosine kinases (Hanks et al., 1991,
Methods in Enzymology, Vol. 200, pp. 38-81; Wilks, 1991, Methods in Enzymology, Vol. 200, pp. 533-546); homeobox
genes; zinc-finger DNA binding proteins (Miller et al., 1985, EMBO Jour., Vol. 4, pp. 1609-1614); receptor proteins; the
signal peptide sequence of secreted proteins; proteins that localize to the nucleus (Guiochon-Mantel et al., 1989, Vol.
57, pp. 1147-1154); serine proteases; inhibitors of serine proteases; cytokines; the SH2 and SH3 domains that have
been described in tyrosine kinases and other proteins (Pawson et al., 1992, Cell, Vol. 71, pp. 359-362); serine/threonine
and tyrosine phosphatases (Cohen, 1991, Methods in Enzymology, Vol. 201, pp. 398-408); cyclins and cyclin-dependent
protein kinases (CDKs) (see for ex., Keyomarsi et al., 1993, Proc. Natl. Acad. Sci., USA, Vol. 90, pp. 1112-1116).
[0086] Primers for any consensus sequence can readily be designed based upon the codon usage of the amino
acids. The incorporation of degeneracy at one or more sites allows the designing of a primer which will hybridize to a
high percentage, greater than 50%, of the mRNAs containing the desired consensus sequence.
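
One way to picture primer design from a consensus peptide is to back-translate each amino acid into a degenerate codon written with IUPAC codes. The Python sketch below does this under the standard genetic code. It is an assumption-laden illustration: the helper names are hypothetical, the peptide `MKWF` is an invented example rather than a consensus from the literature, and amino acids that need two degenerate triplets (Leu, Arg, Ser) are left out of the table.

```python
# Degenerate codon per amino acid (standard genetic code, IUPAC notation).
# Leu, Arg and Ser are omitted because each needs two degenerate triplets.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "M": "ATG",
    "N": "AAY", "P": "CCN", "Q": "CAR", "T": "ACN", "V": "GTN",
    "W": "TGG", "Y": "TAY",
}

# Number of bases represented by each IUPAC symbol used above.
FOLD = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "H": 3, "N": 4}


def back_translate(peptide):
    """Concatenate the degenerate codon for each amino acid in peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())


def degeneracy(seq):
    """Count the distinct sequences a degenerate oligonucleotide encodes."""
    total = 1
    for base in seq:
        total *= FOLD[base]
    return total


example = back_translate("MKWF")   # hypothetical peptide, for illustration
```

For the invented peptide the result is ATGAARTGGTTY, a 4-fold degenerate 12-mer. Choosing residues with few codons, or substituting a base such as inosine at highly degenerate positions, keeps the mixture small.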

[0087] Primers for use in the method according to the invention could be designed based upon the consensus
sequence of the zinc finger DNA binding proteins, for example, based upon the amino acid consensus sequence of the
proteins PYVC. Useful primers for the cloning of further members of this family can have the following sequences:
5'-GTAYGCNTGT [Seq. ID. No. 20] or 5'-GTAYGCNTGC [Seq. ID. No. 21], in which the Y refers to the deoxynucleotides
dT or dC for which the primer is degenerate at this position, and the N refers to inosine ("I"). The base inosine can pair
with all of the other bases, and was chosen for this position of the oligodeoxynucleotide as the codon for valine "V" is
highly degenerate in this position. The described oligodeoxynucleotide primers as used will be a mixture of
5'-GTATGCITGT and 5'-GTACGCITGT or a mixture of 5'-GTATGCITGC and 5'-GTACGCITGC.

(iv) CORRESPONDENCE ADDRESS:

(B) COMPUTER: IBM PC compatible

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 07/850,343

(B) FILING DATE: 11-MAR-1992

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Pasternack, Sam

(B) REGISTRATION NUMBER: 29,576

(C) REFERENCE/DOCKET NUMBER: DFCI234CIP

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 617 227-5020

(B) TELEFAX: 617 227-7566

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTTTTTTTTT TVN

(2) INFORMATION FOR SEQ ID NO:2: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 



TTTTTTTTTT TTV 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

m ' m ' i ' i ' n 9 vkn 



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:



NNTTEMTNN 



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GCTTTATTNC

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

NNHANNATCN

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

NNNANNATCG 10

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GCCACCATGG

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 260 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CTTGATTGCC TCCTACAGCA GTTGCAGGCA CCTTCA6CT3 TACCATGAW3 TTCACAGTCC 60
GGGATTQTGA CCCTAAXACT GGAGTTCCAO ATGAAGATGG AXATGATGAT GAATATGTGC 120
TGGAAGATCT TGAGGTAACT GTGTCTGATC ATATTCAGAA GATACTAAAA CCTAACTTCG 180
CTGCTGCCTG GGAAGAGGTG GGAGGAGCAG CTGCGACAGA GCGTCCTCTT CACAGAGGG6 240
TCCTCGGTGA AAAAAAAAAA 260

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

TTTTTTTTTT TCA 13

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

CTTGATTGCC 10

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GCCRCCATGG

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
AGCCAGCGAA 



(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 



GACCGCTTGT 



(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 



AGGTGACCGT



(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GGTACTCCAC 10 

(2) INFORMATION FOR SEQ ID NO:17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

GTTGCGATCC 10

(2) IMFOKMATIOM FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 


GCCGCCATGG CTCTGAAGAG AATCCACAAG GACACCCATG AA 42 
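
An entry in the listing can be sanity-checked by counting the bases on its sequence line against the stated LENGTH and the trailing running count. The sketch below does this for SEQ ID NO:18. The helper name `parse_sequence_line` is hypothetical, and the parsing rule (space-separated blocks, optional trailing integer) is an assumption drawn from how the lines appear here, not a formal definition of the listing format.

```python
def parse_sequence_line(line):
    """Split a listing line into (bases, running_count or None).

    Assumes blocks of bases separated by spaces, optionally followed by
    a trailing integer giving the position of the last base on the line.
    """
    tokens = line.split()
    count = None
    if tokens and tokens[-1].isdigit():
        count = int(tokens.pop())
    return "".join(tokens), count


seq18, stated = parse_sequence_line(
    "GCCGCCATGG CTCTGAAGAG AATCCACAAG GACACCCATG AA 42"
)
print(len(seq18), stated)  # 42 42
```

A mismatch between the two numbers would suggest that part of a sequence has been lost or displaced during extraction.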

(2) INFORMATION FOR SEQ ID NO:19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

GTTGCATTTA CAACAAGAAX TTATCATCCA AATATTAACA GTAATGGCAG CATTTGTCTT 60

GATATTCTAC GGTCACCT 78

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GTAYGCMTGT

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

GTAYGCNTGC

Claims

1. A method for isolating a DNA complementary to a mRNA in a nucleic acid sample comprising the steps of:

a) contacting the sample with a first oligonucleotide primer under conditions in which said first primer hybridizes
with any mRNA at a first site having a complementary base sequence;

b) reverse transcribing the mRNA using a reverse transcriptase and said first primer to produce a first DNA
strand complementary to at least a portion of the mRNA upstream from said first site;

c) contacting the first DNA strand with a second oligodeoxynucleotide primer under conditions in which said
second primer hybridizes with complementary DNA at a second site;

d) extending the second primer using a DNA polymerase to produce a second DNA strand complementary to
the first DNA strand downstream from said second site; and

e) amplifying the first and second DNA strands using a polymerase, said first primer and said second primer
to form the complementary DNA; wherein:

i) said first primer hybridizes with mRNA at a site that includes a polyA signal sequence; and/or

ii) said first primer hybridizes at a portion of the polyadenosine (polyA) tail of said mRNA and at least one
non-polyA nucleotide immediately upstream of said portion; and/or

iii) said first primer hybridizes at a site including a sequence immediately upstream of a first A ribonucleotide
of the mRNA's polyA tail; and/or

iv) said second primer with a base sequence of at least 6 nucleotides and containing an arbitrary sequence
and a base sequence containing a Kozak sequence.

2. The method of claim 1, wherein said first primer:

a) hybridizes with mRNA that includes at least two nucleotides upstream from and adjacent to the first A 
ribonucleotide of the polyA tail; 

b) includes at least 13 nucleotides; 

c) includes a polyA-complementary region comprising at least 11 nucleotides and, upstream from said
polyA-complementary region, a non-polyA complementary region comprising at least one nucleotide, optionally the
non-polyA complementary region comprising at least 2 contiguous nucleotides.

3. The method of claim 2c), wherein said non-polyA-complementary region comprises 3'-NV, wherein V is one of
deoxyadenosine, deoxycytidine, or deoxyguanosine, and N is one of deoxyadenosine, deoxycytidine,
deoxyguanosine, or deoxythymidine.
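
The 3'-NV anchor of claim 3 defines a small family of first primers: three choices for V times four for N. The following is a minimal sketch, assuming an 11-thymidine polyA-complementary region written 5' to 3', followed by V and then the 3'-terminal N. The function name and that exact layout are illustrative assumptions and are not taken from the claims.

```python
from itertools import product

V_BASES = "ACG"   # V: deoxyadenosine, deoxycytidine or deoxyguanosine
N_BASES = "ACGT"  # N: any of the four deoxynucleosides


def anchored_primers(t_length=11):
    """Return all oligo(dT) primers of the assumed form 5'-T(n)VN-3'."""
    return ["T" * t_length + v + n for v, n in product(V_BASES, N_BASES)]


primers = anchored_primers()
print(len(primers))      # 12
print(len(primers[0]))   # 13
```

Under these assumptions each primer is 13 nucleotides long, which is the minimum length recited in claim 2b).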

4. The method of claim 1a), wherein said first primer comprises at least 6 deoxyribonucleotides.

5. The method of any one of claims 1 to 4, wherein

a) said second primer comprises at least 6 deoxyribonucleotides; or 

b) said second primer includes a randomly selected nucleotide sequence; or 

c) said first or the second primer includes deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymi- 
dine; or 

d) said first or second primer includes a restriction endonuclease recognition sequence; or 

e) said second primer includes a sequence identical to a sequence contained within a mRNA of known se- 
quence; or 

f) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides. 

6. A method according to claim 1, comprising:

contacting the mRNA with the first primer under conditions in which said first primer hybridizes with mRNA at 
a site, 

reverse transcribing the mRNA using the reverse transcriptase and said first primer, to produce the first DNA 
strand, 

contacting the first DNA strand with the second primer under conditions in which said second primer hybridizes 
with the first DNA strand at the second site, which includes a Kozak sequence, 

extending the second primer using a DNA polymerase to produce a second DNA strand complementary to 
the first DNA strand downstream from the site of hybridization of said second primer with said first DNA strand, 
and 

amplifying the first and second DNA strands using a DNA polymerase and said first and second primers. 

7. The method of claim 6, wherein said first primer includes a sequence substantially identical to a sequence contained
within an mRNA of known sequence. 

8. The method according to any one of the preceding claims, wherein

a) said first primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or 

b) said second primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; 
or 

c) said first primer includes a selected arbitrary sequence of deoxyribonucleotides; or 

d) said first primer or said second primer includes a restriction endonuclease recognition sequence; or 

e) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides.

9. A method of comparing the presence or level of individual mRNA molecules in two or more nucleic acid samples,
comprising the steps of:

a) providing a first nucleic acid sample including mRNA molecules and performing the method of any one of
claims 1 to 9 thereupon to produce a first population of amplification products comprising the complementary
DNA;

b) providing a second nucleic acid sample including mRNA molecules and performing the method of any one
of claims 1 to 9 thereupon to produce a second population of amplification products comprising the
complementary DNA;

c) comparing the presence or level of individual amplification products in the first and second populations of
amplification products.

10. The method of claim 9, wherein: 

a) said first nucleic acid sample comprises mRNAs expressed in a first cell and said second nucleic acid
sample consists of mRNAs expressed in a second cell; or

b) said first nucleic acid sample comprises mRNAs expressed in a cell at a first developmental stage and said 
second nucleic acid sample comprises mRNAs expressed in said cell at a second developmental stage. 

11. The method of claim 9 or 10, wherein said first primer includes a polyA-complementary region comprising at least
11 nucleotides and, immediately downstream from said polyA-complementary region, a non-polyA-complementary 
region comprising at least one nucleotide, and optionally said polyA-complementary region comprises at least 11 
contiguous thymidines. 

12. The method of claim 11, wherein said first primer comprises at least 13 nucleotides.

13. The method of claim 9, wherein the nucleotide sequence of said first or said second primer contains a restriction 
endonuclease recognition site, and optionally at least one of said first or second primers comprises a plurality of 
oligodeoxynucleotides, and further optionally said plurality of oligonucleotides comprises a plurality of
oligodeoxynucleotide molecules having the same nucleotide sequence, or individual oligodeoxynucleotide molecules in said
plurality of oligodeoxynucleotides have different nucleotide sequences.

14. The method according to any one of claims 9 to 13, further comprising the step of detecting a difference in the 
presence or level of an individual amplification product in said first population of amplification products as compared
with said second population of amplification products.

15. The method according to any one of claims 9 to 14, wherein the amplifying steps each comprise performing a 
polymerase chain reaction in which the concentration of dNTPs is at or below approximately 20 µM, and/or the
amplifying steps each comprise performing a polymerase chain reaction in which the concentration of dNTPs is
approximately 2 µM.

16. The method according to any one of claims 9 to 15, wherein the step of comparing comprises resolving each of 
said first and second populations of amplification products by gel electrophoresis and comparing the presence or 
level of bands of particular sizes. 
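
The comparison of claims 14 and 16 amounts to asking which band sizes appear in one lane but not in the other. The sketch below assumes each lane has already been reduced to a list of band sizes in base pairs. The tolerance, the function name and the band values are all hypothetical and serve only to illustrate the bookkeeping.

```python
def differential_bands(lane_a, lane_b, tolerance=2):
    """Return (bands only in lane_a, bands only in lane_b).

    Two bands are treated as the same product when their sizes differ
    by no more than `tolerance` base pairs.
    """
    def unmatched(source, other):
        return [s for s in source
                if not any(abs(s - o) <= tolerance for o in other)]

    return unmatched(lane_a, lane_b), unmatched(lane_b, lane_a)


# Hypothetical band sizes (bp) read from two lanes of a gel
first_sample = [120, 185, 260, 340]
second_sample = [121, 260, 410]

only_first, only_second = differential_bands(first_sample, second_sample)
print(only_first)    # [185, 340]
print(only_second)   # [410]
```

Comparing levels rather than presence would need band intensities as well, which this sketch leaves out.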


17. The method according to any one of claims 9 to 16, wherein said first cell comprises a tumorigenic cell and said 
second cell comprises a normal cell. 

18. The method according to any one of claims 9 to 17, further comprising a step of cloning individual amplification 
products from said first or second populations of amplification products.

19. The method according to any one of claims 9 to 18, wherein the second primer that hybridizes to a second site in 
said first and second samples hybridizes to the second site which includes NNNRNNATGN. 
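
The site NNNRNNATGN of claim 19 is written in IUPAC ambiguity codes (R is a purine, N is any base). One way to locate such sites in a sequence is to translate the pattern into a regular expression, as sketched below. The function names are hypothetical, the code table covers only the codes that appear in this document, and SEQ ID NO:18 from the listing is used simply as a convenient test string.

```python
import re

# IUPAC nucleotide codes used in this document, as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]",
    "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC pattern such as NNNRNNATGN into a regex."""
    return re.compile("".join(IUPAC[base] for base in pattern))


def find_sites(pattern, sequence):
    """Return 0-based start positions of all (overlapping) matches."""
    regex = iupac_to_regex(pattern)
    return [i for i in range(len(sequence) - len(pattern) + 1)
            if regex.match(sequence, i)]


seq18 = "GCCGCCATGGCTCTGAAGAGAATCCACAAGGACACCCATGAA"
print(find_sites("NNNRNNATGN", seq18))  # [0]
```

The second ATG in this string is not reported because the base three positions upstream of it is a C, which is not a purine.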

20. The method according to any one of claims 9 to 19, wherein said first primer has a GC content within the range
of about 50-70%. 
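
GC content is the fraction of bases in the primer that are G or C. A minimal sketch of the check implied by claim 20 follows, using one 10-mer from the sequence listing and one hypothetical anchored oligo(dT) primer as inputs. The function names are assumptions, and the hard 0.50 and 0.70 limits do not capture the latitude of the word "about" in the claim.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in an unambiguous primer sequence."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G, T")
    return (primer.count("G") + primer.count("C")) / len(primer)


def in_gc_window(primer, low=0.50, high=0.70):
    """True when GC content lies within the 50-70% range of claim 20."""
    return low <= gc_fraction(primer) <= high


print(gc_fraction("AGCCAGCGAA"))       # 0.6 (SEQ ID NO:13)
print(in_gc_window("AGCCAGCGAA"))      # True
print(in_gc_window("TTTTTTTTTTTCA"))   # False (hypothetical oligo(dT) primer)
```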

21. The method according to any one of claims 9 to 20, wherein the nucleotide sequence of said first primer includes
a sequence substantially complementary to a consensus sequence found in a gene family.

22. The method according to any one of claims 9 to 21, wherein said first and second sites are separated from one 
another so that at least some amplification products in said first and second populations of amplification products
have a size in the range of approximately 100-500 basepairs.

23. The method according to claim 14, further comprising a step of isolating an individual amplification product whose 
presence or level differs in said first and second populations of amplification products. 

24. The method of claim 23, further comprising a step of cloning said isolated amplification product into a vector.

25. The method of claim 23 or 24, further comprising either a step of screening a nucleic acid library with said isolated 
amplification product, or a step of determining the nucleotide sequence of at least a portion of said isolated am- 
plification product. 


Patentansprüche

1. Ein Verfahren zur Isolierung einer DNA, die zu einer mRNA in einer Nucleinsäure-Probe komplementär ist,
umfassend die Schritte:

a) Inkontaktbringen der Probe mit einem ersten Oligonucleotid-Primer unter Bedingungen, bei denen besagter
erster Primer mit beliebiger mRNA an einer ersten Stelle mit einer komplementären Basensequenz hybridisiert;

b) reverses Transkribieren der mRNA unter Verwendung einer reversen Transkriptase und besagten ersten
Primers, um einen ersten DNA-Strang herzustellen, der mindestens zu einem Teil der mRNA stromaufwärts
von besagter erster Stelle komplementär ist;

c) Inkontaktbringen des ersten DNA-Strangs mit einem zweiten Oligodesoxynucleotid-Primer unter Bedingungen,
bei denen besagter zweiter Primer mit komplementärer DNA an einer zweiten Stelle hybridisiert;

d) Extension des zweiten Primers unter Verwendung einer DNA-Polymerase, um einen zweiten DNA-Strang
herzustellen, der zu dem ersten DNA-Strang stromabwärts von besagter zweiter Stelle komplementär ist; und

e) Amplifizieren des ersten und zweiten DNA-Strangs unter Verwendung einer Polymerase, besagten ersten
Primers und besagten zweiten Primers, um die komplementäre DNA zu bilden; worin:

i) besagter erster Primer mit mRNA an einer Stelle hybridisiert, die eine polyA-Signalsequenz mit
einschließt; und/oder

ii) besagter erster Primer an einen Teil des Polyadenosin (polyA) Schwanzes von besagter mRNA und
mindestens an ein nicht-polyA Nucleotid unmittelbar stromaufwärts von besagtem Teil hybridisiert; und/oder

iii) besagter erster Primer an eine Stelle hybridisiert, einschließlich einer Sequenz unmittelbar
stromaufwärts eines ersten A-Ribonucleotids des polyA Schwanzes der mRNA; und/oder

iv) besagter zweiter Primer mit einer Basensequenz aus mindestens 6 Nucleotiden und enthaltend eine
beliebige Sequenz und eine Basensequenz, die eine Kozak-Sequenz enthält.

2. Das Verfahren nach Anspruch 1, worin besagter erster Primer:

a) mit mRNA hybridisiert, die mindestens zwei Nucleotide stromaufwärts von und angrenzend an das erste
A-Ribonucleotid des polyA-Schwanzes mit einschließt;

b) mindestens 13 Nucleotide umfasst;

c) eine polyA-komplementäre Region, umfassend mindestens 11 Nucleotide und, stromaufwärts von besagter
polyA-komplementärer Region, eine nicht-polyA-komplementäre Region mit einschließt, umfassend
mindestens ein Nucleotid, gegebenenfalls die nicht-polyA-komplementäre Region, umfassend mindestens 2
benachbarte Nucleotide.

3. Das Verfahren nach Anspruch 2 c), worin besagte nicht-polyA-komplementäre Region 3'-NV umfasst, worin V ein
Desoxyadenosin, Desoxycytidin oder Desoxyguanosin ist und N ein Desoxyadenosin, Desoxycytidin,
Desoxyguanosin oder Desoxythymidin ist.

4. Das Verfahren nach Anspruch 1a), worin besagter erster Primer mindestens 6 Desoxyribonucleotide umfasst.

5. Das Verfahren nach einem der Ansprüche 1 bis 4, worin

a) besagter zweiter Primer mindestens 6 Desoxyribonucleotide umfasst; oder

b) besagter zweiter Primer eine zufällig ausgewählte Nucleotid-Sequenz mit einschließt; oder

c) besagter erster oder der zweite Primer Desoxyadenosin, Desoxycytidin, Desoxyguanosin und
Desoxythymidin mit einschließt; oder

d) besagter erster oder zweiter Primer eine Restriktionsendonuclease-Erkennungssequenz mit einschließt;
oder

e) besagter zweiter Primer eine Sequenz mit einschließt, die mit einer Sequenz identisch ist, die in einer mRNA
bekannter Sequenz enthalten ist; oder

f) mindestens besagter erster oder zweiter Primer eine Vielzahl an Oligodesoxynucleotiden umfasst.

6. Ein Verfahren nach Anspruch 1, umfassend:

Inkontaktbringen der mRNA mit dem ersten Primer unter Bedingungen, bei denen besagter erster Primer mit
mRNA an einer Stelle hybridisiert,

reverses Transkribieren der mRNA unter Verwendung der reversen Transkriptase und besagten ersten
Primers, um einen ersten DNA-Strang herzustellen,

Inkontaktbringen des ersten DNA-Strangs mit dem zweiten Primer unter Bedingungen, bei denen besagter
zweiter Primer mit dem ersten DNA-Strang an der zweiten Stelle hybridisiert, der eine Kozak-Sequenz mit
einschließt,

Extension des zweiten Primers unter Verwendung einer DNA-Polymerase, um einen zweiten DNA-Strang
herzustellen, der zu dem ersten DNA-Strang stromabwärts von der Stelle, wo besagter zweiter Primer mit
besagtem ersten DNA-Strang hybridisiert, komplementär ist, und

Amplifizieren des ersten und zweiten DNA-Strangs unter Verwendung einer DNA-Polymerase und besagten
ersten und zweiten Primers.

7. Das Verfahren nach Anspruch 6, worin besagter erster Primer eine Sequenz mit einschließt, die im Wesentlichen
mit einer Sequenz identisch ist, die in einer mRNA bekannter Sequenz enthalten ist.

8. Das Verfahren gemäß einem der vorausgehenden Ansprüche, worin

a) besagter erster Primer mindestens 9 Desoxyribonucleotide, gegebenenfalls mindestens 10
Desoxyribonucleotide, umfasst; oder

b) besagter zweiter Primer mindestens 9 Desoxyribonucleotide, gegebenenfalls mindestens 10
Desoxyribonucleotide, umfasst; oder

c) besagter erster Primer eine ausgewählte beliebige Sequenz aus Desoxyribonucleotiden mit einschließt;
oder

d) besagter erster oder besagter zweiter Primer eine Restriktionsendonuclease-Erkennungssequenz mit
einschließt; oder

e) mindestens besagter erster Primer oder zweiter Primer eine Vielzahl an Oligodesoxynucleotiden umfasst.

9. Ein Verfahren zum Vergleich des Vorhandenseins oder Levels individueller mRNA-Moleküle in zwei oder mehr
Nucleinsäure-Proben, umfassend die Schritte:

a) Bereitstellen einer ersten mRNA-Moleküle enthaltenden Nucleinsäure-Probe und Durchführung des
Verfahrens nach einem der Ansprüche 1 bis 9, um eine erste Population Amplifikationsprodukte herzustellen, die
die komplementäre DNA umfassen;

b) Bereitstellen einer zweiten mRNA-Moleküle enthaltenden Nucleinsäure-Probe und Durchführung des
Verfahrens nach einem der Ansprüche 1 bis 9, um eine zweite Population Amplifikationsprodukte herzustellen,
die die komplementäre DNA umfassen;

c) Vergleich des Vorhandenseins oder Levels individueller Amplifikationsprodukte in der ersten und zweiten
Population Amplifikationsprodukte.

10. Das Verfahren nach Anspruch 9, worin:

a) besagte erste Nucleinsäure-Probe mRNAs umfasst, die in einer ersten Zelle exprimiert sind, und besagte
zweite Nucleinsäure-Probe aus mRNAs besteht, die in einer zweiten Zelle exprimiert sind; oder

b) besagte erste Nucleinsäure-Probe mRNAs umfasst, die in einer Zelle in einem ersten Entwicklungsstadium
exprimiert sind, und besagte zweite Nucleinsäure-Probe mRNAs umfasst, die in besagter Zelle in einem
zweiten Entwicklungsstadium exprimiert sind.

11. Das Verfahren nach Anspruch 9 oder 10, worin besagter erster Primer eine polyA-komplementäre Region,
umfassend mindestens 11 Nucleotide und, unmittelbar stromabwärts von besagter polyA-komplementärer Region,
eine nicht-polyA-komplementäre Region mit einschließt, umfassend mindestens ein Nucleotid, und gegebenenfalls
besagte polyA-komplementäre Region mindestens 11 aneinander liegende Thymidine umfasst.

12. Das Verfahren nach Anspruch 11, worin besagter erster Primer mindestens 13 Nucleotide umfasst.

13. Das Verfahren nach Anspruch 9, worin die Nucleotid-Sequenz von besagtem ersten oder besagtem zweiten Primer
eine Restriktionsendonuclease-Erkennungsstelle enthält und gegebenenfalls mindestens besagter erster oder
besagter zweiter Primer eine Vielzahl an Oligodesoxynucleotiden umfasst, und außerdem gegebenenfalls besagte
Vielzahl an Oligonucleotiden eine Vielzahl an Oligodesoxynucleotid-Molekülen mit der gleichen
Nucleotid-Sequenz umfasst, oder einzelne Oligodesoxynucleotid-Moleküle in besagter Vielzahl an
Oligodesoxynucleotiden verschiedene Nucleotid-Sequenzen besitzen.

14. Das Verfahren gemäß einem der Ansprüche 9 bis 13, außerdem umfassend den Schritt der Detektion einer
Differenz bei Vorhandensein oder Level eines individuellen Amplifikationsprodukts in besagter erster Population
Amplifikationsprodukte verglichen mit besagter zweiter Population Amplifikationsprodukte.

15. Das Verfahren gemäß einem der Ansprüche 9 bis 14, worin die Amplifizierungsschritte jeweils eine Durchführung
einer Polymerase-Kettenreaktion umfassen, bei der die Konzentration an dNTPs bei oder unter ungefähr 20 µM
liegt, und/oder die Amplifizierungsschritte jeweils eine Durchführung einer Polymerase-Kettenreaktion umfassen,
bei der die Konzentration an dNTPs ungefähr 2 µM ist.

16. Das Verfahren gemäß einem der Ansprüche 9 bis 15, worin der Vergleichsschritt eine Auftrennung jeweils der
ersten und zweiten Population Amplifikationsprodukte mittels Gelelektrophorese und Vergleich des
Vorhandenseins oder Levels von Banden bestimmter Größen umfasst.

17. Das Verfahren gemäß einem der Ansprüche 9 bis 16, worin besagte erste Zelle eine Tumoren-bildende Zelle
umfasst und besagte zweite Zelle eine gesunde Zelle umfasst.

18. Das Verfahren gemäß einem der Ansprüche 9 bis 17, außerdem umfassend einen Schritt der Klonierung
individueller Amplifikationsprodukte von besagter erster oder zweiter Population Amplifikationsprodukte.

19. Das Verfahren gemäß einem der Ansprüche 9 bis 18, worin der zweite Primer, der an eine zweite Stelle in besagter
erster und zweiter Probe hybridisiert, an die zweite Stelle, die NNNRNNATGN einschließt, hybridisiert.

20. Das Verfahren gemäß einem der Ansprüche 9 bis 19, worin besagter erster Primer einen GC-Gehalt im Bereich
von etwa 50-70% besitzt.

21. Das Verfahren gemäß einem der Ansprüche 9 bis 20, worin die Nucleotid-Sequenz von besagtem ersten Primer
eine Sequenz mit einschließt, die zu einer in einer Genfamilie gefundenen Consensus-Sequenz komplementär ist.

22. Das Verfahren gemäß einem der Ansprüche 9 bis 21, worin besagte erste und zweite Stelle voneinander getrennt
sind, so dass mindestens einige Amplifikationsprodukte in besagter erster und zweiter Population
Amplifikationsprodukte eine Größe im Bereich von ungefähr 100-500 Basenpaare besitzen.

23. Das Verfahren gemäß Anspruch 14, außerdem umfassend einen Schritt der Isolierung eines individuellen
Amplifikationsprodukts, dessen Vorhandensein oder Level in besagter erster und zweiter Population
Amplifikationsprodukte unterschiedlich ist.

24. Das Verfahren nach Anspruch 23, außerdem umfassend einen Schritt der Klonierung besagten isolierten
Amplifikationsprodukts in einen Vektor.

25. Das Verfahren nach Anspruch 23 oder 24, außerdem umfassend entweder einen Schritt der Durchmusterung einer
Nucleinsäure-Bibliothek mit besagtem isolierten Amplifikationsprodukt oder einen Schritt der Bestimmung der
Nucleotid-Sequenz mindestens eines Teils von besagtem isolierten Amplifikationsprodukt.

Revendications

1. Procédé pour isoler un ADN complémentaire d'un ARNm dans un échantillon d'acide nucléique, comprenant les
étapes consistant à :

a) mettre en contact l'échantillon avec une première amorce oligonucléotidique dans des conditions dans
lesquelles ladite première amorce s'hybride avec un ARNm quelconque à un premier site possédant une
séquence de bases complémentaires ;

b) effectuer la transcription inverse de l'ARNm en utilisant une transcriptase inverse et ladite première amorce
pour produire un premier brin d'ADN complémentaire d'au moins une portion de l'ARNm en amont dudit premier
site ;

c) mettre en contact le premier brin d'ADN avec une seconde amorce oligodésoxynucléotidique dans des
conditions dans lesquelles la seconde amorce s'hybride avec un ADN complémentaire à un second site ;

d) allonger la seconde amorce en utilisant une ADN polymérase pour produire un second brin d'ADN
complémentaire du premier brin d'ADN en aval dudit second site ; et

e) amplifier les premier et second brins d'ADN en utilisant une polymérase, ladite première amorce et ladite
seconde amorce, pour former l'ADN complémentaire ; dans lequel :

i) ladite première amorce s'hybride à l'ARNm à un site qui contient une séquence signal polyA ; et/ou

ii) ladite première amorce s'hybride à une portion de la queue polyadénosine (polyA) dudit ARNm et au
moins un nucléotide non polyA juste en aval de ladite portion ; et/ou

iii) ladite première amorce s'hybride à un site contenant une séquence immédiatement en aval d'un premier
ribonucléotide A de la queue polyA des ARNm ; et/ou

iv) ladite seconde amorce s'hybride avec une séquence de base d'au moins 6 nucléotides et contenant
une séquence arbitraire et une séquence de base contenant une séquence Kozak.

2. Procédé selon la revendication 1, dans lequel ladite première amorce :

a) s'hybride avec un ARNm qui contient au moins deux nucléotides en amont du premier ribonucléotide A de
la queue polyA, et adjacent à ce dernier ;

b) contient au moins 13 nucléotides ;

c) contient une région complémentaire de polyA comprenant au moins 11 nucléotides et, en amont de ladite
région complémentaire de polyA, une région non complémentaire de polyA comprenant au moins un
nucléotide, la région non complémentaire de polyA comprenant éventuellement au moins 2 nucléotides contigus.

3. Procédé selon la revendication 2c), dans lequel ladite région non complémentaire de polyA comprend 3'-NV, où
V est un résidu désoxyadénosine, désoxycytidine ou désoxyguanosine, et N est un résidu désoxyadénosine,
désoxycytidine, désoxyguanosine ou désoxythymidine.

4. Procédé selon la revendication 1a), dans lequel ladite première amorce comprend au moins 6
désoxyribonucléotides.

5. Procédé selon l'une quelconque des revendications 1 à 4, dans lequel

a) ladite seconde amorce comprend au moins 6 désoxyribonucléotides ; ou

b) ladite seconde amorce contient une séquence nucléotidique choisie au hasard ; ou

c) ladite première ou seconde amorce contient un résidu désoxyadénosine, désoxycytidine, désoxyguanosine
ou désoxythymidine ; ou

d) ladite première ou seconde amorce contient une séquence de reconnaissance d'une endonucléase de
restriction ; ou

e) ladite seconde amorce contient une séquence identique à une séquence contenue dans un ARNm de
séquence connue ; ou

f) au moins une desdites première ou seconde amorces comprend une pluralité d'oligodésoxynucléotides.

6. Procédé selon la revendication 1, comprenant les étapes consistant à :

mettre en contact l'ARNm avec la première amorce dans des conditions dans lesquelles ladite première
amorce s'hybride avec l'ARNm à un site ;

effectuer la transcription inverse de l'ARNm en utilisant la transcriptase inverse et ladite première amorce,
pour produire le premier brin d'ADN ;

mettre en contact le premier brin d'ADN avec la seconde amorce dans des conditions dans lesquelles la
seconde amorce s'hybride avec le premier brin d'ADN au second site, ce qui inclut une séquence Kozak ;

allonger la seconde amorce en utilisant une ADN polymérase pour produire un second brin d'ADN
complémentaire du premier brin d'ADN en aval du site d'hybridation de ladite seconde amorce avec ledit premier brin
d'ADN ; et

amplifier les premier et second brins d'ADN en utilisant une ADN polymérase et lesdites première et seconde
amorces.

7. Procédé selon la revendication 6, dans lequel ladite première amorce contient une séquence sensiblement
identique à une séquence contenue dans un ARNm de séquence connue.

8. Procédé selon l'une quelconque des revendications précédentes, dans lequel

a) ladite première amorce comprend au moins 9 désoxyribonucléotides, éventuellement au moins 10
désoxyribonucléotides ; ou

b) ladite seconde amorce comprend au moins 9 désoxyribonucléotides, éventuellement au moins 10
désoxyribonucléotides ; ou

c) ladite première amorce contient une séquence choisie arbitrairement de désoxyribonucléotides ; ou

d) ladite première amorce ou ladite seconde amorce contient une séquence de reconnaissance d'une
endonucléase de restriction ; ou

e) au moins une desdites première ou seconde amorces comprend une pluralité d'oligodésoxynucléotides.

9. Procédé de comparaison de la présence ou du taux de molécules d'ARNm individuelles dans au moins deux
échantillons d'acides nucléiques, comprenant les étapes consistant à :

a) se procurer un premier échantillon d'acide nucléique contenant des molécules d'ARNm et le soumettre au
procédé selon l'une quelconque des revendications 1 à 9 afin de produire une première population de produits
d'amplification comprenant l'ADN complémentaire ;

b) se procurer un second échantillon d'acide nucléique contenant des molécules d'ARNm et le soumettre au
procédé selon l'une quelconque des revendications 1 à 9 afin de produire une seconde population de produits
d'amplification comprenant l'ADN complémentaire ;

c) comparer la présence ou le taux des produits d'amplification individuels dans les première et seconde
populations de produits d'amplification.

10. Procédé selon la revendication 9, dans lequel :

a) ledit premier échantillon d'acide nucléique comprend des ARNm exprimés dans une première cellule et
ledit second échantillon d'acide nucléique se compose d'ARNm exprimés dans une seconde cellule ; ou

b) ledit premier échantillon d'acide nucléique comprend des ARNm exprimés dans une cellule à un premier
stade de développement et ledit second échantillon d'acide nucléique comprend des ARNm exprimés dans
ladite cellule à un second stade de développement.

11. Procédé selon la revendication 9 ou 10, dans lequel ladite première amorce contient une région complémentaire
de polyA comprenant au moins 11 nucléotides et, juste en aval de ladite région complémentaire de polyA, une
région non complémentaire de polyA comprenant au moins un nucléotide, et éventuellement ladite région
complémentaire de polyA comprend au moins 11 résidus thymidine contigus.

12. Procédé selon la revendication 11, dans lequel ladite première amorce comprend au moins 13 nucléotides.

13. Procédé selon la revendication 9, dans lequel la séquence nucléotidique de ladite première ou seconde amorce
contient un site de reconnaissance d'une endonucléase de restriction et, éventuellement, au moins une desdites
première ou seconde amorces comprend une pluralité d'oligodésoxynucléotides, et éventuellement aussi ladite
pluralité d'oligonucléotides comprend une pluralité de molécules d'oligodésoxynucléotide ayant la même séquence
nucléotidique, ou les molécules individuelles d'oligodésoxynucléotide dans ladite pluralité
d'oligodésoxynucléotides ont différentes séquences nucléotidiques.

14. Procédé selon l'une quelconque des revendications 9 à 13, comprenant en outre l'étape consistant à détecter une
différence dans la présence ou le taux d'un produit d'amplification individuel dans ladite première population de
produits d'amplification comparée à ladite seconde population de produits d'amplification.

15. Procédé selon l'une quelconque des revendications 9 à 14, dans lequel les étapes d'amplification comprennent
chacune le fait de réaliser une réaction en chaîne par polymérase dans laquelle la concentration des dNTP est
inférieure ou égale à environ 20 µM, et/ou les étapes d'amplification comprennent chacune le fait de réaliser une
réaction en chaîne par polymérase dans laquelle la concentration des dNTP est d'environ 2 µM.

16. Procédé selon l'une quelconque des revendications 9 à 15, dans lequel l'étape de comparaison comprend la
résolution de chacune desdites première et seconde populations de produits d'amplification par électrophorèse
sur gel et la comparaison de la présence ou du taux de bandes de tailles particulières.

17. Procédé selon l'une quelconque des revendications 9 à 16, dans lequel ladite première cellule comprend une
cellule oncogène et ladite seconde cellule comprend une cellule normale.

18. Procédé selon l'une quelconque des revendications 9 à 17, comprenant en outre une étape de clonage de produits
d'amplification individuels à partir desdites première ou seconde populations de produits d'amplification.

19. Procédé selon l'une quelconque des revendications 9 à 18, dans lequel la seconde amorce qui s'hybride à un
second site dans lesdits premier et second échantillons s'hybride au second site qui contient NNNRNNATGN.

20. Procédé selon l'une quelconque des revendications 9 à 19, dans lequel ladite première amorce a une teneur en
GC dans la gamme d'environ 50-70%.

21. Procédé selon l'une quelconque des revendications 9 à 20, dans lequel la séquence nucléotidique de ladite
première amorce contient une séquence sensiblement complémentaire d'une séquence consensus trouvée dans une
famille de gènes.

22. Procédé selon l'une quelconque des revendications 9 à 21, dans lequel lesdits premier et second sites sont séparés
l'un de l'autre de telle sorte qu'au moins certains produits d'amplification dans lesdites première et seconde
populations de produits d'amplification aient une taille dans la gamme d'environ 100-500 paires de bases.

23. Procédé selon la revendication 14, comprenant en outre une étape d'isolement d'un produit d'amplification individuel
dont la présence ou le taux diffère dans lesdites première et seconde populations de produits d'amplification.

24. Procédé selon la revendication 23, comprenant en outre une étape de clonage dudit produit d'amplification isolé
dans un vecteur.

25. Procédé selon la revendication 23 ou 24, comprenant en outre soit une étape de criblage d'une banque d'acides
nucléiques avec ledit produit d'amplification isolé, soit une étape de détermination de la séquence nucléotidique
d'au moins une portion dudit produit d'amplification isolé.

FIG. 2